Trendy Names


In order to see the trend of the names which are present in the table, we have written a function. Here we test two names, “Steve” and “Barbara”. we can see that both of them has been trendy in the past but in the recent years there has been a steep decline in the usage of these names. We might try and see if we can find a name which has been trending recently?

---
title: "Project 1: Exploring 100+ Years of US Baby Names"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
---

```{r setup, include = FALSE}
library(tidyverse)
library(flexdashboard)
FILE_NAME <- here::here("data/names.csv.gz")
tbl_names <- readr::read_csv(FILE_NAME, show_col_types = FALSE)
knitr::opts_chunk$set(
  fig.path = "img/",
  fig.retina = 2,
  fig.width = 6,
  fig.asp = 9/16,
  fig.pos = "t",
  fig.align = "center",
  # dpi = if (knitr::is_latex_output()) 72 else 150,
  out.width = "100%",
  # dev = "svg",
  dev.args = list(png = list(type = "cairo-png")),
  optipng = "-o1 -quiet"
)
ggplot2::theme_set(ggplot2::theme_gray(base_size = 8))
```

### Popular Names

```{r results = "hide"}
# PASTE BELOW >> CODE FROM question-1-transform
tbl_names_popular = tbl_names |> 
  # Keep ROWS for year > 2010 and <= 2020
  filter(year > 2010, year <= 2020) |> 
  # Group by sex and name
  group_by(sex, name) |> 
  # Summarize the number of births
  summarize(
    nb_births = sum(nb_births),
    .groups = "drop"
  ) |> 
  # Group by sex 
  group_by(sex) |>  
  # For each sex, keep the top 5 rows by number of births
  slice_max(nb_births, n = 5)

tbl_names_popular

```


```{r}
# PASTE BELOW >> CODE FROM question-1-plot BELOW
tbl_names_popular |> 
  # Reorder the names by number of births
  mutate(name = fct_reorder(name, nb_births)) |>
  # Initialize a ggplot for name vs. nb_births
  ggplot(aes(x = nb_births, y = name)) +
  # Add a column plot layer
  geom_col() +
  # Facet the plots by sex
  facet_wrap(~ sex, scales = "free_y") +
  # Add labels (title, subtitle, caption, x, y)
  labs(
    title = 'number of births',
    subtitle = 'based on names',
    caption = 'names vs. number of births',
    x = 'number of births',
    y = 'names'
  ) +
  # Fix the x-axis scale 
  scale_x_continuous(
    labels = scales::unit_format(scale = 1e-3, unit = "K"),
    expand = c(0, 0),
  ) +
  # Move the plot title to top left
  theme(
    plot.title.position = 'plot'
  )

```

***

<!-- Add a note to be included in the sidebar -->
In this visualization which shows popular names based on sex we can see that in all of the years, Emma and Noah have been the most famous names.
Although we can see that in the Male group, Naoh and Liam are really close.

### Trendy Names

```{r results = "hide"}
# PASTE BELOW >> CODE FROM question-2-transform 
tbl_names_popular_trendy = tbl_names |> 
  # Group by sex and name
  group_by(sex, name) |> 
  # Summarize total number of births and max births in a year
  summarize(
    nb_births_total = sum(nb_births),
    nb_births_max = max(nb_births),
    .groups = "drop"
  ) |> 
  # Filter for names with at least 10000 births
  filter(nb_births_total > 10000) |> 
  # Add a column for trendiness computed as ratio of max to total
  mutate(trendiness = nb_births_max / nb_births_total) |> 
  # Group by sex
  group_by(sex) |> 
  # Slice top 5 rows by trendiness for each group
  slice_max(trendiness, n = 5)

```


```{r}
# PASTE BELOW >> CODE FROM question-2-visualize
plot_trends_in_name <- function(my_name) {
  tbl_names |> 
    # Filter for name = my_name
    filter(name == my_name) |> 
    # Initialize a ggplot of `nb_births` vs. `year` colored by `sex`
    ggplot(aes(x = year, y = nb_births, color = sex)) +
    # Add a line layer
    geom_line() +
    # Add labels (title, x, y)
    labs(
      title = glue::glue("Babies named {my_name} across the years!"),
      x = 'year',
      y = 'number of births'
    ) +
    # Update plot theme
    theme(plot.title.position = "plot")
}
plot_trends_in_name("Steve")
plot_trends_in_name("Barbara")

```

***

<!-- Add a note to be included in the sidebar -->
In order to see the trend of the names which are present in the table, we have written a function.
Here we test two names, "Steve" and "Barbara". we can see that both of them has been trendy in the past but in the recent years there has been a steep decline in the usage of these names.
We might try and see if we can find a name which has been trending recently?

### Popular First letters 

```{r results = "hide"}
# PASTE BELOW >> CODE FROM question-3-transform-1 and question-3-transform-2
tbl_names = tbl_names |> 
  # Add NEW column first_letter by extracting `first_letter` from name using `str_sub`
  mutate(first_letter = str_sub(name, 1, 1)) |>  
  # Add NEW column last_letter by extracting `last_letter` from name using `str_sub`
  mutate(last_letter = str_sub(name, -1, -1)) |> 
  # UPDATE column `last_letter` to upper case using `str_to_upper`
  mutate(last_letter = str_to_upper(last_letter))

tbl_names_by_letter = tbl_names |> 
  # Group by year, sex and first_letter
  group_by(year, sex, first_letter) |> 
  # Summarize total number of births, drop the grouping
  summarize(nb_births = sum(nb_births), .groups = "drop") |> 
  # Group by year and sex
  group_by(year, sex) |> 
  # Add NEW column pct_births by dividing nb_births by sum(nb_births)
  mutate(pct_births = nb_births / sum(nb_births))

```


```{r}
# PASTE BELOW >> CODE FROM question-3-visualize-1
tbl_names_by_letter |> 
  # Filter for the year 2020
  filter(year == 2020) |>
  # Initialize a ggplot of pct_births vs. first_letter
  ggplot(aes(x=first_letter, y=pct_births)) +
  # Add a column layer using `geom_col()`
  geom_col() +
  # Facet wrap plot by sex
  facet_wrap(~sex, scales="free_y") + 
  # Add labels (title, subtitle, x, y)
  labs(
    title = 'distribution of births in year 2020',
    subtitle = 'based on letters',
    x = 'first letters in names',
    y = 'percentage of letters'
  ) +
  # Fix scales of y axis
  scale_y_continuous(
    expand = c(0.25, 0),
    labels = scales::percent_format(accuracy = 1L)
  ) +
  # Update plotting theme
  theme(
    plot.title.position = "plot",
    axis.ticks.x = element_blank(),
    panel.grid.major.x = element_blank()
  )

```

***

<!-- Add a note to be included in the sidebar -->
based on this figure we can see that in a specific year (we have chosen 2020) people had chosen names which starts with certain letters.